Summary

In  virtual  environments  (VE),  the  limited  field  of  view, 
the  lack  of  information  on  viewing  direction,  and 
possible  transmission  delays  may  be  considered  as 
potential  problems  in  developing  and  maintaining  a good 
sense  of  situation  awareness.  Enabling  unmanned  air 
vehicle (UAV) operators to use high quality (proprioceptive)
information on (changes in) viewing direction
by introducing a head-slaved camera system with
head-slaved  display  (HMD)  may  improve  situation 
awareness,  compared  to  using  a joystick  and  a fixed 
monitor.  However,  HMDs  may  degrade  comfort  and  the 
dynamics  of  head  movements.  Furthermore,  time  delays 
and  zoomed-in  images  induce  a non-steady  presentation 
of  the  environment,  and  may  impede  adequate  mapping 
of  spatial  information.  This  paper  reports  an  exploratory 
study into the applicability of a head-slaved camera
system  in  unmanned  platform  applications.  To  overcome 
the  possible  drawbacks  of  HMDs,  we  compared  an 
HMD  with  a head-slaved  dome  projection  in  a simulator 
experiment.  To  overcome  the  possible  drawbacks  of 
transmission delay, we introduced a new method to
compensate  for  the  spatial  distortions.  This  technique, 
called  delay-handling,  preserves  the  correct  spatial 
relation  between  the  viewing  direction  of  the  camera  and 
operator  by  presenting  incoming  images  in  the  camera 
viewing  direction,  and  not  in  the  actual  viewing  direction 
of  the  operator. 

The  experimental  results  showed  that  delay-handling  is 
successful  in  supporting  the  perception  of  correct  spatial 
relations,  i.e.,  it  improves  situation  awareness.  No 
differences  in  task  performance  were  found  between  the 
actual  HMD  and  the  dome  projection. 

Introduction 

In  operating  a Maritime  Unmanned  Aerial  Vehicle 
(MUAV), the flow of information is very poor
compared to real flying. If a human operator were
physically present at the remote site and performed
manipulations directly, he would receive a variety of
information  on  the  result  of  his  manipulations,  such  as 
visual,  auditory,  tactile,  and  force  feedback.  However, 
when  the  human  is  physically  separated  from  the  task 
space,  the  feedback  of  the  control  actions  has  to  be 
artificially  transmitted  back  to  him. 

The  man-machine  interface  determines  the  extent  to 
which  the  operator  can  sense  the  remote  environment 
and  consequently  control  the  platform.  Thus,  the  display 
and controls in the operator environment should be
designed in such a way that the operator receives
task-specific information and sufficient feedback. The images
provided by an on-board camera are the main source of
information  on  the  outside  world  for  MUAV  operators. 
Because  of  the  inherent  characteristics  of  a camera- 
monitor  system,  and  the  restricted  data  link  between  the 
remote  site  and  the  operator,  these  images  are  of 
degraded  quality,  which  may  affect  steering  and  control 
performance  and  the  operator’s  situation  awareness 
(SA). 

Image  degradation  may  come  in  different  forms,  e.g.  a 
reduced  field  of  view,  a zoomed-in  image,  decreased 
information about the camera viewpoint and viewing
direction,  a time  delay  between  the  control  input  and  the 
consequent  feedback,  and  reduced  spatial  and  temporal 
resolution.  It  is  plausible  that  the  degradation  of  some 
aspects  of  the  feedback  is  more  detrimental  for  operator 
performance  or  the  sense  of  SA  than  others;  some 
information  may  be  redundant  or  of  only  secondary 
value.  In  order  to  identify  the  limitations  that  may 
become  critical  for  the  sense  of  SA  when  the  operator 
manually  controls  MUAV  and/or  camera  movements  we 
first  reflect  on  the  concept  of  SA.  Next,  regarding 
MUAV  operators,  the  main  issues  that  affect  SA  will  be 
discussed. Finally, we establish which principles of
interface  design  may  support  the  operator  in  developing 
a good  sense  of  SA. 

In  teleoperation,  situation  awareness  may  be  defined  as 
the  operator’s  ability  to  perceive,  comprehend,  and 
predict  the  spatial  layout  of  the  elements  in  the 
environment.  SA  is  not  a static  phenomenon,  but  is 
composed  of  a variety  of  changing  facts,  interpretations 
and  predictions  in  the  context  of  task  requirements. 
Although  operator  performance  undoubtedly  depends  on 
SA,  their  exact  relationship  is  not  clear.  Actually,  there  is 
still  disagreement  among  researchers  as  to  just  what 
constitutes SA. However, the elements of SA are well
known  and  include  such  familiar  human  functions  as 
perception,  information  processing,  decision-making, 
memory,  learning,  and  action-taking,  performed  within  a 
dynamic  set  of  environmental  circumstances  and 
conditions. 

SA  is  important  in  a wide  variety  of  environments. 
Acquiring  and  maintaining  SA  becomes  increasingly 
difficult  as  the  complexity  and  dynamics  of  the 
environment  increase.  Under  some  circumstances,  many 
decisions are required within a fairly narrow time span,
and  task  performance  requires  an  up-to-date  analysis  of 
the  environment.  Because  the  state  of  the  environment  is 
constantly changing (often in complex ways), a major
portion  of  the  operator’s  job  becomes  that  of  obtaining 
and  maintaining  good  SA. 

Barfield,  Rosenberg  and  Furness  (1995)  describe  the 
main  components  of  situation  awareness:  spatial,  status, 
and  overall  situation  awareness.  Spatial  or  navigational 
awareness deals with the three-dimensional geometry of
the  environment  and  refers  to  the  operator’s  mental 
model  of  the  vehicle’s  position.  What  is  my  position  and 
how  does  this  relate  to  the  position  of  other  objects?  The 
state  of  the  platform,  e.g.  the  amount  of  remaining  fuel, 
the  position  of  the  flaps,  is  represented  in  the  status 
component  of  awareness.  The  combination  of  spatial  and 
status  awareness  enables  an  overall  awareness  of  the 
total  flight  environment. 

Endsley  (1995)  gives  a more  elaborated  model  of  SA 
with  three  components.  Level  one  in  this  model  refers  to 
the  perception  of  the  elements  in  the  environment  and 
their  relationship  to  other  points  of  reference  (i.e. 
internal  model).  At  this  level,  relevant  characteristics 
(colour,  size,  speed  and  location)  and  the  dynamics  of 
the  objects  in  the  environment  are  represented.  This 
aspect  is  similar  to  what  Barfield  et  al.  (1995)  termed 
spatial  awareness.  Level  two  of  SA  goes  beyond  simply 
being  aware  of  the  elements  that  are  present,  and 
includes  an  understanding  of  the  significance  of  the 
elements.  Based  on  level  one  knowledge,  the  operator 
forms a holistic picture of the environment,
comprehending the significance of objects and events. Thus, the
integration  of  various  level  one  data  elements  at  level 
two  of  SA  is  crucial  for  the  comprehension  of  the 
situation.  Level  two  of  SA  can  be  highly  spatial  in  an 
operating  context.  The  relevance  of  different  objects  for 
the  operator’s  action  planning  will  depend  on  their 
location  and  speed.  Finally,  the  ability  to  project  the 
future  actions  of  the  elements  in  the  environment  forms 
the  third  and  highest  level  of  SA.  For  example,  in  traffic, 
knowledge  of  the  status  and  dynamics,  and  the 
comprehension  of  the  situation,  allows  a driver  to  predict 
the  future  actions  of  other  drivers  in  order  to  prevent 
collisions. 

Another  aspect  of  SA  should  be  mentioned  at  this  point. 
Although  SA  has  been  defined  as  a person’s  knowledge 
of  the  environment  at  a given  point  in  time,  it  is  highly 
temporal  in  nature.  That  is,  some  aspects,  like  the 
knowledge  about  the  dynamics  of  the  environment  and 
path  prediction,  are  acquirable  only  over  time. 
Smolensky  (1993)  discusses  the  work  of  Stein,  who 
showed that controllers’ eye fixation locations, which
had  varied  widely  in  the  initial  10  to  15  minutes  of  an  air 
traffic  simulation,  decreased  significantly  beyond  that 
point  in  time.  Anecdotally,  Stein’s  subjects  reported  that 
the initial 10 to 15 minutes of a controller’s shift is the
period  of  time  during  which  he  acquires  the  ‘big  picture’, 
or,  SA.  Another  temporal  aspect  of  SA  relates  to  the 
variations  in  relevance  of  elements  across  time.  Some 
elements  are  not  of  equal  importance  at  all  times, 
although they should not fall out of consideration
completely. At least some SA on all elements is needed.
SA,  therefore,  is  based  on  far  more  than  simply  the 
information  perceived  about  the  environment.  It  is 
related  to  a model  of  human  information  processing  in 
which  attention  and  long-term  memory  enable 
comprehending  the  meaning  of  information  in  an 
integrated  form.  Memory  does  not  only  serve  to  direct 
attention  effectively,  but  also  serves  to  interpret  the 
information  that  is  perceived  and  to  develop  accurate 
projections  of  future  events. 

SA  in  teleoperation 

In  teleoperation,  an  intervening  system  senses,  mediates, 
and  presents  information  to  the  human  operator.  In  this 
process,  a loss  of  information  can  occur,  which  may  be 
relevant  to  all  three  levels  of  SA. 

At  the  lowest  level,  the  system  may  fail  to  present 
certain  information  that  is  important  for  SA  in  the 
assigned task. First, systems may only present information
of one modality (e.g. only visual information), based
on technological limitations and the designer’s understanding
of what is required. Second, the information
that  is  presented  may  lack  important  cues;  e.g.  no 
stereoscopic  depth  cues  when  a single  camera  is  used. 
Another  major  issue  in  teleoperation  is  the  transmission 
speed  and  capacity.  Intervening  communication  systems 
like  satellites  reduce  transmission  speed,  resulting  in 
delayed feedback to the operator about his manipulations.

For  level  two  SA,  the  information  displayed  by  the 
system  must  be  integrated,  and  related  to  a mental  model 
to  obtain  a holistic  picture,  and  to  determine  which  cues 
are  actually  relevant  to  the  established  goals.  When  no 
model  exists  at  all,  level  two  SA  must  be  developed  in 
memory.  The  absence  of  sufficient  level  one  SA,  the 
inability  to  develop  a sufficient  mental  model  or  the 
inability  to  properly  integrate  or  comprehend  the 
meaning  of  presented  data,  can  lead  to  inaccurate  or 
incomplete  level  two  SA.  This  may  be  caused  by 
incomplete  or  inaccurate  presentation  of  data  to  the 
human  operator,  or  by  a mismatch  between  information 
presentation  and  perceptual,  attentional,  and  working 
memory  characteristics  of  the  operator. 

Finally,  level  three  SA  may  be  lacking  or  incorrect.  Even 
if  the  mental  model  is  sufficient  for  level  two  SA,  and 
the  actual  situation  is  clearly  understood,  it  may  be 
difficult to accurately project future dynamics. The lack of
a highly developed mental model, together with attention and
memory limitations, may account for this. Furthermore,
some  people  are  simply  not  good  at  mental  simulation. 

Regarding the control of unmanned platforms, loss of
SA is already present at level one of SA, causing a
degraded sense of SA at levels two and three as well. The
inability to assess basic properties such as position, direction
and speed also hampers the operator in developing a
correct mental model (level two), and in making
adequate predictions about future states of the objects
(level three). Part of the problem is probably related to
the poor information flow specific to MUAV
applications, which has the following causes:

A small  field  of  view.  A limited  field  of  view  suppresses 
the  use  of  peripheral  visual  information.  The  peripheral 
area  of  the  retina  differs  anatomically  and  functionally 
from  the  foveal  area  (Schneider,  1969;  Trevarthen, 
1968),  and  is  used  to  generate  our  sense  of  spatial 
orientation  (Ungerleider  & Mishkin,  1982;  Jeannerod, 
1997).  For  example,  a human  operator’s  performance  in 
a disturbance  nulling  task  with  only  a central  field  of 
view  display  can  be  dramatically  improved  if  the  field  of 
view  is  expanded  to  cover  the  peripheral  retina  (Kenyon 
& Kneller,  1992). 

Furthermore,  a small  field  of  view  requires  a higher 
degree  of  integration  of  spatial  information  to  build  up  a 
representation  of  the  spatial  environment.  That  is,  rather 
than  having  a large  field  of  spatial  information  in  which 
several  objects  (and  terrain  features)  are  localised,  a 
smaller  field  of  view  affords  less  spatial  information  at 
any  instant,  which  forces  operators  to  integrate  these 
small  ‘pieces’  of  spatial  information  in  time.  The  results 
of  a search  and  replace  experiment  using  an  HMD 
(Venturino  & Kunze,  1989)  indicated  that  the  field  size 
affects  one’s  ability  to  acquire  spatial  information. 
However,  an  important  observation  in  this  experiment 
was  also  that  once  the  spatial  information  has  been 
mapped  into  spatial  memory,  humans  could  use  that 
information  independently  of  the  size  of  their  ‘window’ 
to the world. This phenomenon was also found by
Thompson  (1983),  who  asked  subjects  to  walk  with 
closed  eyes  to  previously  viewed  targets,  and  Tyrell  et 
al.  (1993)  who  asked  visually  occluded  subjects  to 
position  a point  of  light  at  the  location  of  a previously 
viewed  target. 

A zoomed-in image. Often, the small field of view is
combined  with  a zoomed-in  camera  image.  The 
zoom-factor  of  the  camera  disturbs  the  normal  relation 
between  rotational  speed  of  the  camera  and  translational 
flow  in  the  camera  image.  For  example,  Van  Erp, 
Korteling  and  Kappe  (1995)  found  that  operators  largely 
overestimate  camera  rotations  when  viewing  a 
zoomed-in  camera  image. 

Few  points  of  reference  at  sea.  The  lack  of  reference 
points  at  sea  may  hinder  the  operator  in  developing  a 
good  model  of  the  position  of  objects  in  the  remote 
environment  and  their  relations. 

Low  update  rate.  Update  rates  lower  than  4 Hz  limit  the 
perception  of  the  direction  and  speed  of  objects,  platform 
and  camera. 

Transmission  delays.  Transmission  delays  will  mainly 
lead  to  degraded  performance  of  the  operator  when 
manually  controlling  the  camera.  Eventually,  the 
operator  will  develop  a go-and-wait  strategy,  which  will 
hamper  developing  a sense  of  SA. 

Degraded  information  on  (changes  in)  the  viewing 
direction. Controlling the viewing direction of the
camera by means of a joystick while the images are
presented on a stationary monitor deprives the operator
of proprioceptive feedback on viewing direction.
Normally this information is provided by muscle
spindles of neck and eyes, and therefore allows
automatic mapping of visual information on a mental
model. Since the viewing direction cannot be directly
deduced  from  the  camera  images,  it  is  usually  presented 
via  additional  indicators.  However,  this  information 
requires  the  operator  to  perform  some  kind  of  cognitive 
processing  in  order  to  build  a mental  model,  and  it  is  not 
intuitive  and  therefore  slow. 

In  previous  research,  it  was  shown  that  introducing  high 
quality  synthetic  visual  information  can  partly  cancel  out 
problems  regarding  the  zoomed-in  camera  image,  the 
lack  of  reference  points,  the  low  update  rate  and  the 
transmission  delay,  which  all  have  an  important  camera 
control  component  (Van  Erp,  Kappe  & Korteling,  1996). 
Field  size  and  information  on  viewing  direction  may  be 
considered  as  the  most  important  factors  related  to  SA  in 
unmanned  platform  applications.  Moreover,  both  factors 
probably  interact  strongly.  Although  spatial  information 
can  be  used  effectively  regardless  of  the  size  of  the 
‘window’  to  the  world  once  it  is  stored  in  spatial 
memory, the lack of information about the viewing
direction  of  the  camera  hinders  the  building  of  a mental 
representation,  and  the  integration  of  new  information. 

Head-slaved  camera  control 

A possibility  to  convey  high  quality  information  about 
camera  viewing  direction  is  the  use  of  a head-slaved 
camera  system.  When  the  viewing  direction  of  the 
camera  is  coupled  to  the  viewing  direction  of  the 
operator,  proprioceptive  information  is  available,  which 
can  be  interpreted  automatically.  Automatic  processing 
tends  to  be  fast,  autonomous,  effortless,  and  unavailable 
to  conscious  awareness  in  that  it  can  occur  without 
attention.  It  is  hypothesised  that  system  designs  that 
support  automatic  processing  of  information  directly 
benefit  performance. 

Applying  a head-slaved  camera  system  also  requires  a 
head  coupled  image  presentation  (i.e.  a head  mounted 
display,  HMD)  instead  of  a fixed  monitor,  see  Kappe, 
Van  Erp  and  Korteling  (in  press).  However,  the  use  of 
head-slaved camera control in combination with an
HMD  also  has  two  potential  drawbacks. 

First,  HMDs  may  influence  comfort  and  control 
behaviour  of  the  operator.  Kotulak  and  Morse  (1995) 
discuss  a survey  of  58  aviators  by  Behar,  who  found  that 
51%  had  visual  discomfort,  35%  had  headache,  and  2\% 
had  blurred  vision.  These  symptoms  could  have  a 
common origin: eye-head co-ordination could be
affected by HMD characteristics, and smaller field sizes
place  heavy  demands  on  head  movements,  since  subjects 
must  move  their  heads  to  sample  the  environment  rather 
than  using  the  more  effortless  joystick  control.  A study 
by  Gauthier,  Martin  and  Stark  (1986)  suggests  that  the 
greater  head  inertia  associated  with  HMDs  may  induce  a 
decrease  in  the  amplitude-velocity  relationship  of  head 
movements,  i.e.  slowing  of  head  movement  and  small 
changes  in  head  amplitude.  Further,  eye  movements  may 
change  secondary  to  these  changes  in  head  velocity.  Eye 
movement maximum amplitude and velocity increase
with  increasing  inertia.  Gauthier  et  al.  (1986)  studied 
these  effects  of  added  head  inertia  and  discuss  that 
oscillopsia  (continuous  displacement  or  instability  of  the 
visual  world)  was  prominent  and  consistent  in  perceptual 
reports  of  their  subjects. 

Second,  transmission  delays  may  distort  the  correct 
relation  between  the  external  environment  and  the 
perceived  visual  array.  Because  the  images  on  an  HMD 
are  presented  in  the  actual  viewing  direction  of  the 
operator,  a transmission  delay  introduces  a discrepancy 
between  the  viewing  direction  of  the  camera  at  the 
moment  the  images  were  recorded  at  the  remote  site,  and 
the  viewing  direction  of  the  operator  at  the  moment  the 
images  are  presented.  This  results  in  the  operator 
perceiving  the  world  as  unstable  when  he  moves  his 
head.  For  example,  when  the  operator  has  a steady  image 
of an object, moving his head will ‘drag’ it across the
environment  during  the  transmission  delay.  Therefore, 
transmission  delays  will  probably  impede  adequate 
spatial  mapping  of  the  visual  information. 

A possibility  to  reduce  the  first  drawback  (comfort)  is  to 
project  the  images  in  a moving  window  projected  onto  a 
dome,  instead  of  on  an  HMD.  A possibility  to  prevent 
the  second  drawback  (delay)  is  to  display  the  images  in 
the  viewing  direction  of  the  camera  at  the  moment  of 
recording,  and  not  in  the  actual  viewing  direction  of  the 
operator  (called  delay-handling  throughout  the  paper). 
This  results  in  an  image  location  which  corresponds  with 
the  image  content,  and  follows  the  actual  viewing 
direction  of  the  operator  with  a delay,  instead  of  an 
image  location  which  corresponds  with  the  actual 
viewing  direction,  but  not  with  the  image  content. 
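
The difference between the two presentation strategies can be illustrated with a minimal sketch. It is not the simulator code: the function name `presented_frame`, the restriction to a single heading angle, and the assumption that the camera follows the head without any lag of its own are introduced here purely for illustration.

```python
def presented_frame(head_history, delay_samples, delay_handling):
    """Return (image_location, image_content) as headings in degrees.

    head_history  : head headings, oldest first, newest last. The camera is
                    assumed to follow the head exactly (sketch assumption).
    delay_samples : transmission delay expressed in samples.
    delay_handling: True to present the image in the camera viewing
                    direction at the moment of recording.
    """
    current_head = head_history[-1]
    # The image arriving now was recorded delay_samples ago.
    index = max(0, len(head_history) - 1 - delay_samples)
    camera_at_recording = head_history[index]
    if delay_handling:
        # Image location matches image content, but lags behind the head.
        location = camera_at_recording
    else:
        # Image location matches the head, but not the image content.
        location = current_head
    return location, camera_at_recording


# Head sweeps to the right by 10 degrees per sample; delay of 3 samples.
history = [0, 10, 20, 30, 40]
print(presented_frame(history, 3, delay_handling=False))  # (40, 10)
print(presented_frame(history, 3, delay_handling=True))   # (10, 10)
```

Without delay-handling the image recorded at 10° is shown at 40°, so a stationary object appears to be dragged along with the head; with delay-handling the same image is shown at 10°, where it was recorded.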

In  case  the  field  of  view  on  the  environment  has  the 
same  size  as  the  field  of  presentation  (which  is  defined 
as the size of the display on which the view on the
environment  can  be  presented,  e.g.  the  size  of  the  dome), 
the  principle  of  delay-handling  will  lead  to  image  loss  on 
the side contralateral to the direction of motion.
Therefore,  the  field  of  presentation  must  preferably  have 
spare  space  to  overcome  this  loss.  In  this  respect,  domes 
are  preferable.  The  size  of  this  spare  space  and  the 
transmission  delay  determine  the  maximum  speed  the 
camera  can  rotate  without  image  loss. 
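
The relation between spare space, transmission delay and maximum rotation speed can be made explicit with a small calculation. The sketch below rests on simplifying assumptions that are not stated in the text: the camera rotates at a constant speed, the operator's viewing direction stays centred in the field of presentation (as with an HMD), and the spare space is split equally over both sides. Under these assumptions the image lags the viewing direction by speed times delay, and image loss starts when this lag exceeds the spare space on one side. The function name `max_rotation_speed` is hypothetical.

```python
def max_rotation_speed(presentation_deg, window_deg, delay_s):
    """Largest constant rotation speed (deg/s) without image loss.

    presentation_deg : width of the field of presentation (degrees)
    window_deg       : width of the presented view on the environment (degrees)
    delay_s          : transmission delay (seconds)
    """
    if delay_s <= 0:
        return float("inf")  # no delay means no lag, hence no image loss
    spare_deg = (presentation_deg - window_deg) / 2.0  # spare space on one side
    return spare_deg / delay_s


# Illustrative values only: a 40 degree display, a 10 degree window, 0.5 s delay.
print(max_rotation_speed(40.0, 10.0, 0.5))
```

Doubling the delay halves the admissible rotation speed, which is why a larger field of presentation becomes more valuable as the delay grows.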

Experiment 

The  present  exploratory  experiment  was  used  to 
investigate  the  possibilities  of  head-slaved  camera 
control for unmanned platforms. To elaborate on the
possible drawbacks mentioned above, we used two
presentation modes: a head-mounted display, and a
moving window on a dome; and we introduced different
transmission delays and tested the principle of delay-handling.
To test the effect on the operator’s sense of
SA, we developed an experimental task, which included
level  one,  two  and  three  of  SA  as  defined  by  Endsley 
(1995). 


Subjects 

Seven  college-educated,  right-handed  male  subjects 
(age:  20  to  27  years)  participated  in  the  experiments.  All 
subjects  had  normal  or  corrected  to  normal  vision,  were 
paid  for  their  participation,  and  had  no  experience  with 
similar  operator  tasks. 

Apparatus 

All  images  were  generated  by  a three-channel  Evans  and 
Sutherland  ESIG  2000  image  generator  (30  Hz  update 
rate).  The  images  were  presented  via  a head  mounted 
display (N-Vision, 41.5° x 34.5°, 800x600 pixels HxV),
or via a projection screen (a Seos PRODAS HiView S-600
projection system, consisting of a spherical dome
and three video projectors; radius 2.9 m, 150° x 42°,
2400x600  pixels  HxV).  The  subject’s  head  was 
positioned  in  the  centre  of  the  dome.  Head  orientation 
(horizontal  and  vertical)  was  registered  by  a Polhemus 
Fastrak head-tracker (resolution 0.15°, 30 Hz), with the
sensor  coil  either  mounted  on  the  HMD  or  on  a 
lightweight  plastic  helmet  (weight  <0.1  kg).  Minimum 
delay  between  head-tracking  and  displaying  was  about 
60  ms.  Head  tracker  data  was  used  as  input  for  the 
mathematical model (running at 30 Hz on a 486-based
PC),  which  calculated  the  motions  of  the  simulated 
(head-slaved)  camera  and  the  objects  in  the  database. 
The  mathematical  model  also  simulated  the  transmission 
delay  between  the  camera  and  the  operator,  by  using  a 
pipeline  with  a size  of  30  times  the  transmission  delay 
(s).  A second  486-based  PC  was  used  for  scenario 
generation  and  data  storage  (30  Hz  sampling  frequency). 
The  presented  view  on  the  environment  (window)  had  a 
size  of  13.3°  x 10.0°,  and  could  be  projected  in  the 
actual  viewing  direction,  or  in  the  viewing  direction  of 
the  camera  for  which  the  images  were  generated.  Note 
that  with  a transmission  delay  this  resulted  in  a delayed 
image  content  and  a delayed  image  location, 
respectively. 
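
A pipeline of this kind can be sketched as a first-in, first-out buffer whose length equals the update rate times the delay. The class below is an illustration under assumptions, not the original model: the name `DelayPipeline` and its interface are hypothetical, and only the 30 Hz rate and the sizing rule are taken from the description above.

```python
from collections import deque


class DelayPipeline:
    """Delay samples by a fixed number of update steps (FIFO buffer)."""

    def __init__(self, delay_s, rate_hz=30):
        # Pipeline size: update rate (Hz) times transmission delay (s).
        self.length = round(delay_s * rate_hz)
        self.buffer = deque()

    def push(self, sample):
        """Insert the newest sample; return the delayed one (None while filling)."""
        self.buffer.append(sample)
        if len(self.buffer) > self.length:
            return self.buffer.popleft()
        return None


# A 0.1 s delay at 30 Hz gives a pipeline of 3 samples.
pipe = DelayPipeline(0.1)
print([pipe.push(heading) for heading in range(6)])  # [None, None, None, 0, 1, 2]
```

With the delays used in the experiment (0.5, 1.0, 2.0 and 4.0 s) this sizing rule gives pipelines of 15, 30, 60 and 120 samples; a delay of 0 s passes each sample through unchanged.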

The  subject  was  seated  in  a chair  with  a right  armrest,  on 
which  a spring-loaded  joystick  was  mounted.  A response 
button  was  mounted  on  top  of  the  joystick  (Figure  1). 


Figure 1: An overview of the TNO MUAV-simulator facility


Task 

The camera-platform remained at a fixed position and
orientation throughout the experiment, at an altitude of 500
feet. The virtual environment depicted by the camera
image consisted of a textured sea, twelve ships, and six
square so-called oil-rigs. The oil-rigs were arranged
along imaginary gridlines, such that they enclosed an
area defined by parallel and perpendicular lines between
the rigs (Figure 2). This area was defined as forbidden
for target ships. The distance between the platforms was
1000-2000 feet.

Six  moving  ships  of  equal  type  were  defined  as  targets; 
the other six ships were distracters of a different,
smaller type and had to be ignored. The targets moved
at  45  feet/s  along  a winding  route  that  was  unknown  to 
the subject, and had a maximum turn rate of 3°/s. The
ships  headed  for  an  end  position  within  the  forbidden 
area. 

The overall task instruction was to give a signal when a
target ship entered the forbidden area. This task
consisted of the following parts:

• determine  the  form  and  location  of  the  forbidden  area 
by detecting the position of the oil-rigs, and drawing
imaginary  borders, 

• detect  and  monitor  the  position  and  track  of  the  target 
ships, 

• give  a signal  whenever  a target  ship  enters  the 
forbidden  area. 

This  experimental  task  was  designed  to  implement  the 
different  levels  of  SA  as  introduced  by  Endsley  (1995). 
Level  one  refers  to  the  position  of  the  oil-rigs  and  the 
ships,  their  attributes,  and  their  spatial  relations  in  the 
environment.  Level  two  refers  to  comprehending  the 
significance  of  the  different  elements:  which  ships  are 
targets,  and  which  targets  are  heading  for  the  forbidden 
area.  Level  three  refers  to  the  need  to  predict  the  future 
position  of  targets,  e.g.  assess  which  of  the  targets  will 
reach  the  forbidden  area  first. 

Figure 2: Illustration of a possible alignment
of the six oil-rigs (bird’s eye view, with a target ship).

At  the  time  that  one  of  the  targets  actually  crossed  a 
border  (marked  target  position  in  Figure  2),  subjects  had 
to  keep  the  ship’s  stem  in  the  centre  of  the  camera  image 
and  push  the  button  on  the  joystick.  The  target  ship 
disappeared  when  it  was  held  within  2°  of  the  centre  of 
the  image  at  the  time  of  the  response.  When  the  subject 
did  not  give  a response,  the  target  ship  automatically 
disappeared when it reached a predefined end position
within the forbidden area. Whenever a target ship
disappeared, a new target ship was placed at a different
position in the environment to keep the number of ships
to be monitored constant during a run. A run was
completed  when  six  target  ships  had  disappeared. 

During  the  run,  performance  was  recorded  in  order  to 
calculate  objective  performance  measures  afterwards. 
Furthermore,  after  the  completion  of  a session,  subjects 
were  given  a post-test  to  ascertain  that  they  had 
memorised  the  alignment  of  the  oil-rigs,  i.e.  if  they 
developed  a mental  model  of  the  world  during  a run.  A 
forced-choice  procedure  was  used,  in  which  the  subjects 
had  to  choose  the  actual  alignment  of  the  oil-rigs  out  of 
the  six  drawings  (bird’s  eye  view)  of  possible 
alignments. 

Independent  variables 

Three  independent  variables  were  manipulated  in  a full 
factorial  within  subjects  design:  presentation  mode 
(HMD  and  dome  projection),  delay-handling  (absent, 
present),  and  transmission  delay  (0,  0.5,  1.0,  2.0,  and  4.0 
s),  resulting  in  twenty  conditions. 
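
The twenty conditions follow directly from crossing the three factors (2 x 2 x 5). As a brief illustration, with variable names chosen here for readability only:

```python
from itertools import product

presentation_modes = ["HMD", "dome"]
delay_handling = [False, True]            # absent, present
transmission_delays_s = [0.0, 0.5, 1.0, 2.0, 4.0]

# Full factorial design: every combination of the three factor levels.
conditions = list(product(presentation_modes, delay_handling, transmission_delays_s))

print(len(conditions))   # 20
print(conditions[0])     # ('HMD', False, 0.0)
```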

Dependent  variables 

The  following  performance  measures  were  used: 

• Time  to  locate  the  oil  rigs  (s).  The  measure  was 
defined  as  the  time  it  took  a subject  to  locate  all  six 
oil-rigs,  i.e.  the  time  until  the  camera  had  been 
pointed  at  all  of  the  six  platforms  at  least  once. 

• Time  to  border  crossing  (s).  The  measure  “time  to 
border  crossing”  for  each  target  was  calculated  as  the 
time  that  a target  was  away  from  the  border  to  be 
crossed  at  the  moment  of  the  response  of  the 
participant.  Time  to  border  crossing  was  taken  over 
all  targets  signalled  by  the  participant  (between  1 and 
6).  This  measure  reflects  the  accuracy  of  the  subjects 
in  estimating  the  position,  course  and  speed  of  the 
target ship relative to the oil-rigs, i.e. their accuracy
in  the  perception  and  prediction  of  spatial  relations. 

• SD heading (°). The measure “SD heading” is
defined  as  the  standard  deviation  of  the  heading  of 
the  viewing  direction  during  a single  run,  and  is  a 
measure  of  viewing  behaviour. 

• SD pitch (°). The measure “SD pitch” is defined as
the  standard  deviation  of  the  pitch  of  the  viewing 
direction  during  a single  run,  and  is  a measure  of 
viewing  behaviour. 

• Multiple choice on platform orientation. This
measure  was  calculated  as  the  number  of  correct 
choices  of  the  alignment  of  the  six  oil-rigs  (summed 
over  the  levels  of  transmission  delay). 
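
Two of the measures above can be sketched in a few lines. The sketch makes assumptions the text does not settle: the population form of the standard deviation is used (the sample form would be an equally defensible reading), heading values are taken not to wrap around, and the time to border crossing is averaged over the targets signalled in a run. The function names are hypothetical.

```python
from statistics import mean, pstdev


def sd_viewing_direction(angles_deg):
    """Standard deviation (degrees) of the heading or pitch samples of one run."""
    return pstdev(angles_deg)


def mean_time_to_border_crossing(times_s):
    """Mean time to border crossing (s) over the targets a subject signalled.

    Returns None when the subject signalled no target in the run.
    """
    if not times_s:
        return None
    return mean(times_s)


# Made-up samples, for illustration only.
print(sd_viewing_direction([-10.0, 10.0]))        # 10.0
print(mean_time_to_border_crossing([4.0, 6.0]))   # 5.0
```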

Statistical  design 

The  experiment  was  completed  in  sessions  consisting  of 
the  five  transmission  delay  levels  for  a combination  of 
presentation  mode  and  delay-handling.  These  blocks  of 
five  runs  were,  although  not  completely,  order-balanced 
across  the  subjects.  Within  each  block,  the  order  of 
transmission delay was randomised. For each subject,
the twenty scenarios were randomly assigned to the
conditions,  with  the  restriction  that  each  combination  of 
condition  and  scenario  occurred  only  once  throughout 
the  experiment. 
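
One way to satisfy the restriction that no combination of condition and scenario is repeated is a randomised cyclic (Latin-square style) assignment. This is a substitute technique chosen for illustration; the text does not say how the original assignment was produced. The function name `assign_scenarios` and the seed parameter are assumptions.

```python
import random


def assign_scenarios(n_subjects=7, n_conditions=20, seed=0):
    """Return one list per subject: the scenario index used in each condition.

    Each subject sees every scenario exactly once, and no (condition,
    scenario) pair is shared between subjects. Requires
    n_subjects <= n_conditions.
    """
    rng = random.Random(seed)
    scenarios = list(range(n_conditions))
    rng.shuffle(scenarios)                 # random labelling of the scenarios
    shifts = list(range(n_conditions))
    rng.shuffle(shifts)                    # a distinct random shift per subject
    assignment = []
    for subject in range(n_subjects):
        shift = shifts[subject]
        row = [scenarios[(cond + shift) % n_conditions]
               for cond in range(n_conditions)]
        assignment.append(row)
    return assignment


table = assign_scenarios()
pairs = {(cond, scen) for row in table for cond, scen in enumerate(row)}
print(len(pairs))  # 140, i.e. 7 subjects x 20 conditions without repeats
```

Because every subject receives a different cyclic shift, two subjects can never share the same scenario in the same condition.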

Each dependent variable was checked for outliers (scores
that deviated by more than 3 SD from the overall mean)
and sphericity. Occasionally, a very large score on the time to
border crossing was found. Target ships could approach
a border until they were at a short distance from it but,
because of the winding route they moved along, not
actually cross it. Therefore, values greater than
20 s were removed from the analysis. No other outliers
were found.
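
The two screening rules described above can be sketched as follows. The function names and the sample scores are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev

def split_outliers(scores, n_sd=3.0):
    """Split scores into (kept, flagged) using a deviation of more than n_sd SDs from the overall mean."""
    centre = mean(scores)
    spread = stdev(scores)
    kept = [s for s in scores if abs(s - centre) <= n_sd * spread]
    flagged = [s for s in scores if abs(s - centre) > n_sd * spread]
    return kept, flagged

def drop_above_cutoff(scores, cutoff_s=20.0):
    """Remove time-to-border-crossing values greater than the 20 s cutoff."""
    return [s for s in scores if s <= cutoff_s]

# Hypothetical scores in seconds: 21 ordinary values and one extreme value
scores = [4.0, 5.0, 6.0] * 7 + [60.0]
kept, flagged = split_outliers(scores)
print(flagged)
print(len(drop_above_cutoff(scores)))
```

Note that the 3 SD rule can only flag a value when the sample is reasonably large, because an extreme score also inflates the mean and the SD it is compared against.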

Results of the performance measures “time to locate the
oil-rigs”, “time to border crossing”, “SD heading”, and
“SD pitch” were analysed by a within-subjects design
with three factors: presentation mode (2) × delay-handling
(2) × transmission delay (5) with the statistical
package STATISTICA 5.0. Significant results were further
analysed by a post-hoc Tukey test. Results of the
multiple choice question (only one observation per
session of five runs) were analysed by a within-subjects
design with two factors: presentation mode (2) × delay-handling
(2).

Procedure 

First,  subjects  received  a brief  written  explanation  about 
the  general  nature  and  procedures  of  the  experiment.  The 
instructor then showed the projection dome, chair, the
plastic  helmet  and  the  HMD,  and  explained  the  purpose 
and  task  in  more  detail.  The  subjects  came  in  pairs:  one 
subject  performed  a session  of  five  runs,  preceded  by  a 
practice  run,  while  the  other  subject  rested.  The  practice 
run  was  with  no  transmission  delay,  was  not  registered, 
and  performed  with  a scenario  not  used  during  the 
experiment.  After  a session  the  subject  was  instructed  to 
perform  the  multiple-choice  task  in  a room  near  the 
room  in  which  the  dome  was  situated. 

Results 

Presentation  mode.  On  the  basis  of  experimental 
observations  (see  Gauthier  et  al.,  1986)  and  the  smaller 
field  of  presentation,  a disadvantage  of  the  HMD  was 
expected. However, none of the performance measures
showed a significant effect of presentation mode.

Delay-handling. Two dependent variables showed a
positive main effect of delay-handling. Time to border
crossing showed a performance increase of 15% with
delay-handling present [means 5.8 s and 4.9 s,
F(1,6)=23.91, p<.01]. The mean number of correct
answers on the multiple choice task increased by 40%
(means 2.4 and 3.4) with delay-handling present, F(1,6)=
21.00, p<.01. Delay-handling showed no significant
interactions.
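
The reported percentages can be checked against the reported means with a short calculation. The sketch assumes that the first mean of each pair belongs to the condition without delay-handling and the second to the condition with delay-handling, and that a lower time-to-border-crossing score reflects better performance; the function name is hypothetical.

```python
def percent_change(reference, value):
    """Relative change of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference

# Reported means; the assignment of means to conditions is an assumption
border_change = percent_change(5.8, 4.9)   # negative value = reduction
choice_change = percent_change(2.4, 3.4)   # positive value = increase

print(round(border_change, 1))
print(round(choice_change, 1))
```

Under these assumptions the changes come out at roughly 15% and 40%, in line with the rounded figures given in the text.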

Transmission delay. Three performance measures
showed a main effect of transmission delay: the time
needed to locate the oil-rigs [F(4,24)=20.72, p<.01], the
time to border crossing [F(4,24)=7.75, p<.01], and SD
pitch [F(4,24)=6.39, p<.01]. All effects showed
performance decline with increasing transmission delay.
The post hoc tests indicated that performance on the
former two variables was degraded for delays larger than
0.5 s, and on SD pitch only for a delay of 4 s.
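
The degrees of freedom of the F ratios reported above can be related to the design with a small calculation. The sketch assumes a standard fully within-subjects ANOVA in which each main effect is tested against its own effect-by-subject interaction; under that assumption, the reported values are consistent with seven subjects. The function name is hypothetical, and the subject count is an inference from the degrees of freedom rather than a figure taken from this section.

```python
def rm_main_effect_df(levels, n_subjects):
    """Degrees of freedom (effect, error) for a within-subjects main effect.

    Assumes the error term is the effect-by-subject interaction.
    """
    df_effect = levels - 1
    df_error = df_effect * (n_subjects - 1)
    return df_effect, df_error

# Two-level factors (presentation mode, delay-handling) and the five-level delay factor
print(rm_main_effect_df(levels=2, n_subjects=7))
print(rm_main_effect_df(levels=5, n_subjects=7))
```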

Discussion 

The  present  study  concentrates  on  the  concept  of 
situation  awareness  (SA)  in  relation  to  camera  control  of 
unmanned  platforms  using  virtual  environment  (VE) 
techniques.  In  the  introduction,  it  was  hypothesised  that 
inherent  characteristics  of  the  man-machine  interface, 
like  the  limited  field  of  view  and  the  time  delay  between 
image  recording  at  the  remote  site  and  image 
presentation,  may  hamper  the  operator  in  developing  a 
good  sense  of  SA.  Providing  the  operator  with  high 
quality  information  on  (changes  in)  viewing  direction  by 
introducing a head-slaved camera system with head-slaved
display may support the operator and improve
SA.  However,  literature  also  shows  that  such  systems 
may  degrade  other  aspects,  e.g.  comfort,  control  strategy, 
and  the  spatial  relation  between  viewing  direction  of 
camera  and  operator  as  a result  of  transmission  delays. 
The  present  experiment  focussed  on  the  applicability  of 
head-slaved camera systems in MUAV applications. To
overcome possible drawbacks of HMDs, we compared a
head-mounted display with a head-slaved dome
projection. To overcome the possible drawbacks of
transmission delay, we introduced a mechanism of
delay-handling  which  preserves  the  correct  spatial 
relation  between  viewing  direction  of  the  camera  and  the 
operator  by  presenting  incoming  images  in  the  camera 
viewing  direction,  and  not  in  the  actual  viewing  direction 
of  the  operator.  A new  experimental  task  was  introduced 
to  include  the  different  levels  of  SA  as  discerned  by 
Endsley  (1995). 

The  results  show  no  significant  effect  of  presentation 
mode. Although mean values of SD heading and SD
pitch were higher with dome projection than with
the HMD, the effects did not reach significance (p=.16
and p=.10, respectively).

The results indicated a positive main effect of the
principle of delay-handling (depicting the delayed
images in the camera direction, not in the actual head direction).
Both the time to border crossing and the
multiple choice task show performance improvement
when delay-handling is applied. Time to locate all
oil-rigs and control behaviour did not differ with
delay-handling absent or present. This indicates that
delay-handling is especially useful for developing higher levels
of SA, i.e. in determining the exact spatial relation
between the oil-rigs and the imaginary borders and the
targets.

The main effect of transmission delay shows that this
variable degrades both the development of the sense of
SA at all levels and the control behaviour of the
operator.

Because delay-handling results in a window moving
with  a delay,  the  available  field  of  presentation  must  be 
larger  than  the  field  of  view.  This  may  be  a disadvantage 
for  the  HMD  mode  of  presentation,  because  HMDs  have 
a restricted field of presentation. However, the lack of an
interaction between presentation mode and delay-handling shows
that the field of presentation of the presently used HMD
was sufficient.
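
The geometric reason why the field of presentation must exceed the camera field of view can be illustrated with a simplified, one-axis sketch. It assumes that the image arriving at a given moment was recorded with the camera pointing where the head pointed one total delay earlier, that headings are sampled at a fixed rate, and that the display is centred symmetrically on the current head direction (as in an HMD). The function names, the sampling rate, the turning speed, and the 40° camera field of view are hypothetical values chosen for the example.

```python
def image_offsets(head_headings, delay_steps):
    """Offset (deg) of each displayed image relative to the current head heading.

    With delay-handling, the image shown at step i is drawn in the direction
    the head had delay_steps samples earlier, not in the current direction.
    """
    offsets = []
    for i, current in enumerate(head_headings):
        source = head_headings[max(i - delay_steps, 0)]
        offsets.append(source - current)
    return offsets

def required_field_of_presentation(camera_fov_deg, offsets):
    """Smallest symmetric field (deg) that still contains the whole displaced image."""
    return camera_fov_deg + 2.0 * max(abs(o) for o in offsets)

# Hypothetical head turn of 30 deg/s sampled at 10 Hz, total delay 0.5 s (5 samples)
headings = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0, 18.0]
offsets = image_offsets(headings, delay_steps=5)
print(offsets)
print(required_field_of_presentation(40.0, offsets))
```

In this simplified model the displacement of the image grows with both the head velocity and the delay, which is why a restricted field of presentation could become a limitation at longer delays or faster head movements.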

We  also  expected  an  interaction  between  delay-handling 
and transmission delay. Increasing transmission delays
disturb the spatial relations more for the same
control signals, and were therefore expected to increase
the positive effects of delay-handling. Even a third-order
interaction (presentation mode × delay-handling ×
transmission delay) might have been present.
Transmission  delays  were  supposed  to  be  compensated 
by  presenting  the  images  in  the  spatially  correct  viewing 
direction. This method requires a field of presentation
that is larger than the size of the camera images and
that must be increased with increasing time delays. Since the
field  of  presentation  of  the  HMD  is  restricted,  an 
additional  advantage  of  the  dome  projection  was 
expected for larger transmission delays. However, none
of these interactions was found.

Recommendations 

It  is  recommended  to  perform  human  factors  research 
aimed  at  further  improving  operator  performance  by 
optimising  interface  design.  Areas  of  interest  include  the 
following:

• Directly  compare  the  effects  of  joystick  versus 
head-coupled  camera  control  on  the  sense  of  SA  and 
camera  control  performance. 

• Investigate  the  effects  of  a zoomed-in  camera  image 
on  head-coupled  camera  control.  The  zoomed-in 
camera  image  disturbs  the  relation  between  head 
rotations  and  translational  flow  in  the  image,  which 
may  be  confusing  and  uncomfortable  to  the  operator. 

• Further  explore  the  applicability  of  the  method  of 
delay-handling  in,  for  example,  situations  in  which 
the camera translates through the remote environment,
or in which the camera image is zoomed-in.

• Investigate  the  relation  between  man-machine 
interface  characteristics  and  the  different  levels  of 
SA,  and  develop  specific  operator  support.  An 
example  is  adding  high  quality  visual  information  to 
the  camera  image  to  provide  the  visual  information 
that  is  lost  in  some  situations,  e.g.  as  a consequence 
of  the  low  update  rate  of  the  image  (by  presenting 
visual  motion  information),  a zoomed-in  image  (by 
presenting  correct  translational  flow  for  camera 
rotations),  and  transmission  delays  (by  introducing  a 
predictive  display). 